Data Visualisation and Exploration with Prior Knowledge

Authors

  • Martin Schroeder
  • Dan Cornford
  • Ian T. Nabney
Abstract

Visualising data for exploratory analysis is a major challenge in many applications where there is a need to gain insight into the structure and distribution of the data (e.g. to find common patterns and to identify relationships between samples as well as variables). Typically, visualisation methods like principal components analysis (PCA) and multi-dimensional scaling (MDS) are employed. These methods are favoured because of their simplicity, but it is difficult to incorporate prior knowledge about properties of the variable space into the analysis (particularly important where strong correlations are present) or to cope with missing data. One way to benefit from highly correlated variables is to model them using a block correlation matrix; this reduces the number of free parameters significantly and also captures the structural knowledge. In this way, noise on the correlated variables is modelled with a common parameter. In this paper we show how to utilise such information by using a modification of a well-known non-linear probabilistic visualisation model, Generative Topographic Mapping (GTM). The model has the advantage that it can cope with missing data, which is particularly valuable in high-dimensional sparse datasets. We show that including prior information about the grouping of variables in the covariance structure of the model can improve both the data visualisation and the model fit. These benefits are demonstrated on artificial data as well as on a real geochemical dataset used for oil exploration, where the modification improved the imputation results by 3 to 13%.
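
The parameter saving that the abstract attributes to a block correlation matrix can be illustrated with a minimal sketch. The parametrisation below is an assumption made for illustration only and is not taken from the paper: each block of variables shares one variance and one within-block correlation, and variables in different blocks are treated as uncorrelated. The function names `block_covariance`, `free_parameters_full` and `free_parameters_block` are hypothetical.

```python
from typing import List, Sequence


def block_covariance(
    block_sizes: Sequence[int],
    variances: Sequence[float],
    correlations: Sequence[float],
) -> List[List[float]]:
    """Build a block-diagonal covariance matrix as nested lists.

    Assumed (illustrative) parametrisation: every variable in block b shares
    one variance and one within-block correlation; variables in different
    blocks are uncorrelated.
    """
    if not (len(block_sizes) == len(variances) == len(correlations)):
        raise ValueError("need one variance and one correlation per block")
    dim = sum(block_sizes)
    cov = [[0.0] * dim for _ in range(dim)]
    start = 0
    for size, var, rho in zip(block_sizes, variances, correlations):
        for i in range(start, start + size):
            for j in range(start, start + size):
                # Diagonal: shared variance; off-diagonal: shared covariance.
                cov[i][j] = var if i == j else rho * var
        start += size
    return cov


def free_parameters_full(dim: int) -> int:
    """Free parameters of an unconstrained symmetric covariance matrix."""
    return dim * (dim + 1) // 2


def free_parameters_block(n_blocks: int) -> int:
    """Free parameters under the assumed scheme: variance + correlation per block."""
    return 2 * n_blocks


# Five variables grouped into two blocks of sizes 3 and 2.
block_sizes = [3, 2]
cov = block_covariance(block_sizes, variances=[1.0, 4.0], correlations=[0.8, 0.5])
n_full = free_parameters_full(sum(block_sizes))
n_block = free_parameters_block(len(block_sizes))
```

With five variables in two blocks, the unconstrained matrix has 15 free parameters while this assumed block scheme has 4; the gap widens quickly as the number of variables grows.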


Similar resources

An Interactive Data Visualisation Approach for Next Generation Presentation Tools - Towards Rich Presentation-based Data Exploration and Storytelling

Existing research in the field of information visualisation has shown that interactive data exploration and storytelling can significantly improve the extraction and transfer of knowledge from raw data. Established visualisation techniques help viewers to strengthen their mental model and improve the understanding of the underlying data. However, these techniques are not yet manifested in slide...


Interactive and Narrative Data Visualisation for Presentation-based Knowledge Transfer

In recent years, presentation tools such as Apple’s Keynote or Microsoft PowerPoint play an important role in knowledge transfer. Despite the fact that over the last decade we have witnessed various technological advances and new media types, existing presentation tools still mainly support the presenter-driven delivery of static content. On the other hand, research in information visualisation...


Visual Exploration of Complex Network Data Using Affective Brain-Computer Interface

This paper describes the current state of the work aimed towards an affective application of BCI to the task of complex data visual exploration. The developed technological approach exploits the idea of supporting tacit and complex domain-specific knowledge acquisition during the examination of visual images built using large input data sets. The presented experimental research on the complex n...


Meld: a pattern supported methodology for visualisation design

The quantity and complexity of data that users are being exposed to is increasing. This is especially evident in domains such as the World Wide Web, software engineering, medical systems, etc. There is a need for the user to be supported in the analysis of this data, which may incorporate tasks such as data exploration, navigation, and manipulation. Visualisation is an area that can provide too...




Publication date: 2009